NUMERICAL CALCULATION METHOD, 

NUMERICAL CALCULATOR AND 
NUMERICAL CALCULATION PROGRAM 

BACKGROUND OF THE INVENTION 

The present invention relates to a numerical calculation technique using a 
computer. 

In technique to control a physical quantity such as a pressure or a temperature, 
successive approximation algorithm has been conventionally used. A residual cutting 
method is known as highly convergent successive approximation algorithm suitably used 
for rapidly and accurately obtaining a solution of each parameter to be controlled. 
Furthermore, Japanese Laid-Open Patent Publication No. 2001-134304 has already 
described technique to further improve the convergence in the residual cutting method. 

In the successive approximation algorithm such as the residual cutting method, the 
accuracy of a solution is improved by adding a correction quantity to an approximate 
solution. A sequence of past correction quantities is used for calculating the correction 
quantity, and the accuracy of the correction quantity can be improved as the sequence is 
longer. On the other hand, however, as the sequence of past correction quantities is 
longer, a memory region necessary for the calculation is larger and time required for 
obtaining the correction quantity is longer. 

SUMMARY OF THE INVENTION 

An object of the invention is, in numerical calculation using the successive 
approximation algorithm, suppressing a memory region necessary for the calculation and 
shortening calculation time while keeping calculation accuracy. 
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Specifically, the present invention provides a numerical calculation method, a 
numerical calculator or a recording medium that stores a numerical calculation program for 
a physical quantity U by solving AU = f, wherein A is a coefficient matrix (in N rows by 
N columns; wherein N is a positive integer) obtained through discrete of a partial 
5 differential equation to be satisfied by the physical quantity U, and f is an inhomogeneous 
term (source term). In the numerical calculation method, the numerical calculator or the 
recording medium, the processes of setting 0 as an initial value of a number m of repeating 
times, giving 0 as an initial value of a perturbation quantity <f> and setting (f - A • U°) as 
an initial value r° of a residual r; and repeatedly executing a first step and a second step 

10 while incrementing the number m of repeating times until an approximate solution U m is 
converged, and the first step includes the steps of setting an initial value U° of the physical 
quantity U; obtaining a predicted approximate value \\F of A-0 = r m through repeated 
calculation performed by a first calculation unit including an internal solver, and the 
second step includes the steps of: obtaining, from the predicted approximate value \|/ m , a 

15 corrected approximate value </> m for minimizing L 2 norm of a residual r m through an 
optimization routine performed by a second calculation unit; and giving (U m + 0 m ) as an 
approximate solution U m+1 and giving (r™ - A* <f> m ) as a residual r m+1 , and in the second step, 
obtained elements of a vector sequence A- 0 m are sampled by a given sampling method to 
be stored in a memory, and a residual minimization coefficient ai m (wherein 1 = 1, L) 

20 used for obtaining the corrected approximate value 0 m is approximately obtained by using 
elements of a vector sequence A-^ k (wherein k = m - L + 1, . .., m - 1) stored in the 
memory. 

In sampling of elements bi, b2, ... and bw of the vector sequence A-^ m performed 
in the second step, elements b\ (wherein ie£2) are preferably selected, whereas a subset Q 
25 is defined as follows: 



O = {i : mod [i, Ig] = 1 } U {i : Ifi/aiil > P} 
wherein lg is an integer, P is a real number, fi is an element of the source term and an is a 
diagonal term on the ith row in the ith column of the matrix A. 

5 BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a flowchart of a numerical calculation method using the residual cutting 

method; 

FIGS. 2 A and 2B are conceptual diagrams of sampling of elements of a vector 
sequence according to an embodiment of the invention; 
10 FIGS. 3A and 3B are diagrams for showing evaluation results of the invention; 

and 

FIG. 4 is a block diagram for conceptually showing the architecture of a 
numerical calculator of the embodiment. 

1 5 DETAILED DESCRIPTION OF THE INVENTION 

A preferred embodiment of the invention will now be described with reference to 
the accompanying drawings. 

FIG. 1 is a flowchart of a numerical calculation method according to the 
embodiment of the invention. FIG. 4 is a block diagram for showing the architecture of a 
20 numerical calculator 1 according to the embodiment. 
(Formulation of residual cutting equation) 

It is herein assumed that a physical quantity to be obtained is U, that a coefficient 
matrix (in N rows by N columns, wherein N is a positive integer) obtained through discrete 
of a partial differential equation to be satisfied by the physical quantity U is A and that an 
25 inhomogeneous term (source term) is f. Under these conditions, an equation to be solved 
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is: 

A-U = f ... (1) 

This equation can be generally solved by using successive approximation such as the SOR 
method or the ADI method. 
5 In the present invention, the following solution method is employed: 

A solution U°° of the equation (1) is represented by using an approximate solution 
U and a perturbation quantity (namely, a difference from a true solution) cj> as follows: 

U°° = U+ <f> ... (2) 

In the solution method of this invention, the true solution of the equation (1) is 
10 obtained through correction of the approximate solution U by using the obtained 
perturbation quantity <f> . 

At this point, a residual r of the approximate solution U is defined as follows: 
r = f-A-U ... (3) 
On the basis of the equations (1) through (3), the following is obtained: 
15 A(U + '4>) = f 

/. A</> =f-AU 
= r 

Accordingly, in order to obtain the perturbation quantity <f>, the following equation is 
solved: 

20 A^ =r ... (4) 

(Algorithm for residual cutting method) 

In the algorithm employed in the invention, a convergence solution of the 
equation (4) is not to be obtained but a predicted approximate value \\f is obtained through 
a minimum unit of repeated calculations with the steepest convergence gradient by the 
25 ADI method or the like, and the thus obtained predicted approximate value \\t is supplied to 
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an optimization control routine. The calculation of the predicted approximate value \|/ is 
repeatedly executed until the result of the optimization control routine satisfies a 
predetermined condition. 
Optimization control routine> 
5 Assuming that the number of repeating times is m, a synthesize perturbation 

quantity <f> m for minimizing the L 2 norm of the residual and a new approximate solution 
U m+1 are defined as follows: 

L 

<f> =a x y/ +2^ a i 9 ■■• ( 5 ) 

1=2 - - - 

U m+] =U m +<f> m ... (6) 
10 wherein ai m (1 = 1, 2, 3, L) is a residual minimization coefficient and is a constant 
obtained through calculation described later. The residual is minimized so as to reduce 
the L 2 norm of the residual as follows: 

When the new approximate solution U m+1 is represented by the equation (6), a 
residual r m+1 of this approximate solution is represented as follows on the basis of the 
15 equation (3): 

r^ +1 =f-A-U m+1 ... (7) 
When the equation (7) is substituted in the equation (6) and the equations (3) and (5) are 
used, the following is obtained: 

r -+i =r m - a™ A • x// m - Y^a^A . <f> m ~ ux ... (8) 

1=2 

20 The L 2 norm ||r m+1 || of the residual r m+1 of the approximate solution U m+1 is given 

by the following equation (9) as a square root of a sum of all points of (r m+1 ) 2 : 

lh m 1 = Jd>" - < A ■ v" -1L< A ■ t m - M ?) ... (9) 

V i 1=2 
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For minimizing the L 2 norm ||r m+1 || of the residual i m+l , the residual minimization 
coefficient ai m (1 = 1 through L) of the equation (9) is determined by the least square 
method as follows: 

d/da t m (\r m+} f) = 0 
5 1 = 1,2,3,. ..,L ... (10) 

The equations (10) are simultaneous equations of L unknowns with respect to the 
coefficient ai m , and when these equations are numerically solved, the coefficient ai m can be 
defined. When the coefficient ai m is defined, the synthesize perturbation quantity (f> m is 
obtained on the basis of the equation (5) and the approximate solution U m+1 is obtained on 
10 the basis of the equation (6). The calculation method for the residual minimization 
coefficient a™ will be described later. 

While incrementing the number m, similar routines are repeated, so as to attain 
convergence of the solution U°° for making zero or minimizing the L 2 norm of the residual 
of the equation (9). 

15 Now, the numerical calculation method according to this embodiment will be 

described with reference to the flowchart shown in FIG. 1. This numerical calculation 
method can be practiced by allowing a computer to execute a program for realizing this 
numerical calculation method that is recorded in a recording medium such as a CD-ROM 2. 
First, in step SI, an initial value U° of the physical quantity U to be obtained is set. 
20 Then, in step S2, 0 (zero) is given as the initial value of the number m of repeating times, 0 
(zero) is given as the initial value of the perturbation quantity <f>, and the initial value r° of 
the residual r is set to (f - AU°). 

Thereafter, procedures of steps S3 through S7 are repeatedly executed with the 
number m of repeating times incremented until the approximate solution U m is converged. 
25 In steps S3 through S6, an approximate solution (predicted approximate value) \|/ m 
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of the following equation is obtained by using a first calculation unit 10 including an 
internal solver 11 that executes the successive approximation such as the ADI method, the 
SOR method or the CG method: 
A-^=r m 

5 However, when this equation is to be solved by the existing successive approximation, it is 
necessary to repeat the calculation too many times for practical use, and hence, the 
calculation unavoidably takes a massive time. Therefore, in this embodiment, the upper 
limit N of the number of repeating times of the successive calculation is set (in step S5). 

When the predicted approximate value \|/° is obtained through steps S3 through S6, 
10 this predicted approximate value vj/° is used, in step S7, for obtaining a corrected 
approximate value </>° for minimizing a next residual r 1 through the optimization routine 
executed by a second calculation unit 20. On the basis of the equations (2) and (3), this 
corrected approximate value ^° is used for obtaining a first-order approximate value U 1 of 
the physical quantity and a first-order residual r 1 by the following equations: 
15 Ij'-U 0 * <j>° ... (12) 

r'^f-AU 1 ... (13) 
When the equation (12) is substituted in the equation (13), 
r^f-A^ 0 * 0°) 
= f- AU°- A0° 
20 =r°-A^° ... (14) 

Thus, the first-order residual r 1 can be obtained by the equation (14). 

The number m of repeating times is incremented in step S9, and the flow returns 
to step S3 for repeating similar calculation. When such procedures are repeated, (U 2 , r 2 ), 
(U 3 , r 3 ), ... (U m , r m ), ... are successively obtained, and the norm ||r m || of the residual is 
25 successively reduced. 



An equation for obtaining a predicted approximate value \j/ m in step S4 when the 
number of repeating times is m is as follows: 
K</> =r m 

Therefore, by using a corrected approximate value <f> m optimized through the optimization 
5 control routine, the followings are obtained in step S7: 

U m+1 =U m + </> m 

^ = r m - A0 m 
(Method for obtaining residual minimization coefficient ai) 

Herein, 5 is introduced as follows for simplification: 

10 5 m = V n 

gm-i+i = ^m-i+i^ wherein l = 2, 3, L 

In this case, the corrected approximate value <f> 171 is represented as follows: 

L 

<P =Z^ a i 5 

J=\ 

Therefore, 



/=] 
/=i 

When this equation is expressed without using the symbol Z, 

r m + l =T tn_ (a m A6 m + +^"1^-2 + + 

20 The residual minimization coefficient a" 1 can be obtained by using, for example, 

the least square method employing a condition for minimizing the (m+l)th-order residual 



norm ||r m+1 || (a square root of a sum of squares). 
\\r m ( -t^y 

i 

= J^ir m -Y j arA8 n - M f 

5 When this equation is differentiated with respect to a™, the following is obtained: 
d/dar(\r m+v () = 0 

When this equation is solved with respect to a™, the residual minimization coefficient ai m 
can be obtained. The differentiation results in the following: 

d/da;\\r m+ f) = 2^(r m -£ ai m A6* M )-(-AS* v+l ^ 

i i=\ i 

(S m = if/ m S m ~ M = <f> m - I+} (1 = 2, 3, L)) 

wherein 1' = 1, 2, L. When these linear simultaneous equations are solved, the 

residual minimization coefficient a™ can be obtained. 

For example, when L = 3, the above equations are obtained as ternary linear 
15 simultaneous equations as follows: 

a\ m Z(A\\T) 2 + a 2 m 2(A\|/ m A^ m ' 1 ) + a 3 m S(Av|/ m A^ m " 2 ) = ZO^Ay" 1 ) 
ai m Z(A^ mrl A\|/ m ) + a 2 m 2(A^ m1 ) 2 + a3 m 2(A^ m - 1 A^ m ' 2 ) = X(r m A<t> m ' 1 ) 
arZ(A0 m ' 2 Ay m ) + a 2 m 5:(A^ m - 2 A ( zJ m " 1 ) + a 3 m 2( A <j> m ' 2 ) 2 = Z(r m A0 m - 2 ) 

Therefore, 

20 <f> m = ai n y n + a 2 m <t> mA + a 3 m <f> m ~ 2 
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r m+i = r - _ ( ai m Av|/ m + a 2 m Af n - } + a 3 m A^ m - 2 ) 
= r m - A^T 

u m+1 = u m + 0 m 

<Sampling of vector sequence> 
5 As described above, it is necessary to solve the aforementioned linear 

simultaneous equations to obtain the residual minimization coefficient on m In these 
equations, a vector sequence of A-<fi m '\ A-^ m ' 2 , etc. appears frequently. Therefore, in 
this embodiment, the obtained elements of the vector sequence A-0 m are stored in a 
memory 30 in step S7. Thus, the residual minimization coefficient a™ (1 = 1, L) to be 
10 used for obtaining the corrected approximate value <f) m can be obtained by using the 
elements of the vector sequence A*^ k (k = m- L+ 1, m - 1) already stored in the 
memory 30, and hence, the efficiency in the calculation can be improved. 

Furthermore, in this embodiment, a method for storing the vector sequence A- <p m 
devised to save the memory capacity and the calculation time will be proposed. 
15 Specifically, not all the elements bi, b 2 , and bw of the vector sequence A*0 m are stored 
in the memory 30 but part of the elements are sampled to be stored. 

At this point, a subset £2 of a set { 1, 2, . . ., N} is defined as follows: 
£2={i:mod[i, lg] = l} U {i : \fj*z\ > P} 
wherein Ig is an integer parameter, P is a real parameter, an is an element on the ith row in 
20 the ith column, namely, a diagonal term, of the matrix A, is an element of the source term 
f When lg = 1 or p = 0, Q = {1, 2, N}. In storing the elements of the vector 
sequence A-^ m , not all the elements bi, b2, and bw are stored but merely elements 
whose row numbers i are contained in the subset £2, namely, merely elements bi (ie £2), are 
selected to be stored. In other words, the aforementioned inner product calculation is 
25 approximately performed by using a sum of the sampled elements alone. 
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FIGS. 2A and 2B are diagrams for conceptually showing exemplified sampling of 
the elements of the vector sequence A- ^ m . FIG. 2A is a graph for showing the 
distribution of the respective elements fi of the source term f, which is normalized by the 
diagonal term an of the matrix A. FIG. 2B shows the exemplified sampling performed on 
5 the distribution shown in FIG. 2A, wherein an example A shows a sequence obtained when 
all the elements are selected, an example B shows a sequence obtained when elements at 
predetermined intervals (every five elements) are selected, and an example C shows a 
sequence obtained when the elements belonging to the subset Q. are selected as in this 
embodiment. Herein, the value of the real parameter p is set on the basis of the maximum 

10 value of |f|/aii|, and p = 0.98 max (1^1). 

As is understood from FIGS. 2 A and 2B, when the elements belonging to the 
subset £2 are selected as in this embodiment (namely, in the example C), sampled elements 
are obtained as a combination of spatially uniformly sampled elements (shown with a 
symbol A) and so-called locally sampled elements on the basis of the characteristic of the 

15 physical phenomenon (shown with symbols O and ®). It goes without saying that the 
sampling method is not limited to that described herein but spatial sampling similar to the 
example B alone may be employed or local sampling on the basis of the physical 
phenomenon alone may be employed. For example, the subset SI may be defined as 
follows: 

20 Q = {i : mod [i, lg] = 1} or 

Q={i:|fi/aii|>P} 

However, in order to save the memory capacity and shorten the calculation time while 
keeping the calculation accuracy, the combination of the spatial sampling and the local 
sampling on the basis of the physical phenomenon is preferably employed. 
25 <Evaluation of the invention> 4 



In order to evaluate the effect of the invention, numerical calculation of common 
equations is actually executed on a computer by the residual cutting method separately in 
the case where the sampling is not performed (referred to as the case 1) and in the case 
where the sampling is performed by using the above-described subset Q. (referred to as the 
5 case 2). In this evaluation, the equations to be solved are linear simultaneous equations 
obtained through discrete of the Poisson's equation on a three-dimensional lattice by using 
a secondary accuracy difference, the SOR method is employed in the internal solver 11, 
and the number of the residual minimization coefficients a™ is 15 (L = 15). Also, lg = 7, 
p = 0.98 max (|fi/aii|). 

10 FIGS. 3 A and 3B show the evaluation results. In FIG. 3B, the ordinate indicates 

the relative residual ||r||/||f]|, and the abscissa indicates CPU time. It is understood from 
FIGS. 3A and 3B that when the sampling according to this embodiment is employed, the 
calculation time is shortened by approximately 15% and the used memory capacity is 
reduced by approximately 25% as compared with the case where the sampling is not 

15 performed. 

The numerical calculation herein described is usable in a variety of fields. For 
example, it is applicable to analysis technology such as fluid analysis as well as to control 
technology such as driving control for a vehicle. 

As described so far, according to this invention, the residual minimization 

20 coefficient ai m (1 = 1, L) to be used for obtaining the corrected approximate value ^ m 
can be approximately obtained by using the elements of the vector sequence A- 0 k (k = m - 
L + 1, m-1) stored in the memory, and hence, the efficiency in the calculation can be 
improved. In addition, the elements of the vector sequence A- ^ k are sampled to be stored 
in the memory, the memory capacity necessary for the calculation can be suppressed and 

25 the calculation time can be shortened. 



